SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Palese, Peter
(i) APPLICANT: [surname illegible], Robert

(ii) TITLE OF INVENTION: IDENTIFICATION [illegible] OF ANTIVIRAL [illegible] PROTEIN [illegible] FOR VIRAL REPLICATION

(iii) NUMBER OF SEQUENCES: 20

(iv) CORRESPONDENCE ADDRESS:
(B) STREET: [number illegible] Avenue of the Americas
(C) CITY: New York
(D) STATE: New York
(E) COUNTRY: USA
(F) ZIP: 10036-[illegible]

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Coruzzi, Laura A.
(B) REGISTRATION NUMBER: [illegible]
(C) REFERENCE/DOCKET NUMBER: [illegible]

(ix) TELECOMMUNICATION INFORMATION:
(C) TELEX: 66141 PENNIE



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GCAAAGCAGG AGAAACCAC 
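
Each entry declares a LENGTH and then prints the residues in blocks of ten, so a scanned copy can be sanity-checked by counting. The following is a minimal sketch of such a check for SEQ ID NO: 1; the helper name `residue_count` is hypothetical and not part of the listing.

```python
def residue_count(sequence_line: str) -> int:
    """Count residues in a listing line, ignoring the spaces between blocks."""
    return sum(1 for ch in sequence_line if ch.isalpha())

declared_length = 19                  # from "(A) LENGTH: 19 base pairs"
seq_id_1 = "GCAAAGCAGG AGAAACCAC"     # SEQ ID NO: 1 as printed

# True when the printed sequence agrees with the declared length.
print(residue_count(seq_id_1) == declared_length)
```

The same count can be applied to any of the longer entries by summing over their lines.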
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GGGTCCATCT GATRGATATG AGAG 
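
SEQ ID NO: 2 contains the symbol R at position 14. Assuming this is the IUPAC ambiguity code for a purine (A or G) rather than a scanning artifact, the entry stands for two concrete 24-base sequences. The sketch below enumerates them; the `IUPAC` table is deliberately partial and the function name `expand` is hypothetical.

```python
from itertools import product

# Partial table of IUPAC nucleotide ambiguity codes (assumed sufficient here).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

def expand(degenerate: str) -> list[str]:
    """Return every concrete sequence matched by a degenerate sequence."""
    bases = degenerate.replace(" ", "")
    return ["".join(choice) for choice in product(*(IUPAC[b] for b in bases))]

variants = expand("GGGTCCATCT GATRGATATG AGAG")
for v in variants:
    print(v, len(v))
```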
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:
(A) NAME/KEY: modified_base
(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:
(A) NAME/KEY: modified_base
(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:
(A) NAME/KEY: modified_base
(B) LOCATION: 41
(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:
(A) NAME/KEY: modified_base
(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:
(A) NAME/KEY: modified_base
(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:
(A) NAME/KEY: modified_base
(D) OTHER INFORMATION: /mod_base= i

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CUACUACUAC UAGGCCACGC GTCGACTACT ACGGGNNGGG NNGGGNNG

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
TCCTGATGTT GCTGTAGACG 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GCACGACTAG TATGATTTGC 
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Thr Gly Ala Gly Ala Gly Leu Gly
1 5
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
[illegible] Ser Ala Ala Lys

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..27

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GAG TGG CTG GAA TTC CCC ATG GCG TCC
Asp Trp Leu Glu Phe Pro Met Ala Ser
1 5
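
A CDS feature can be cross-checked by translating the printed codons with the standard genetic code and comparing the result with the residues printed beneath them. The sketch below does this for SEQ ID NO: 8 as it appears in this copy; the function `translate` and the variable names are hypothetical. For the text as printed, the check reports a disagreement at codon 1 (GAG encodes Glu under the standard code, while the listing gives Asp). Whether the codon or the residue was mis-scanned cannot be decided from this text alone.

```python
# Standard genetic code, written in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds: str) -> str:
    """Translate a coding sequence (spaces allowed) into one-letter residues."""
    dna = cds.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

seq_id_8 = "GAG TGG CTG GAA TTC CCC ATG GCG TCC"  # codons as printed
listed = "DWLEFPMAS"  # Asp Trp Leu Glu Phe Pro Met Ala Ser, as printed

computed = translate(seq_id_8)
# 1-based codon positions where the translation and the listing disagree.
mismatches = [i + 1 for i, (x, y) in enumerate(zip(computed, listed)) if x != y]
print(computed, mismatches)
```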

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
Asp Trp Leu Glu Phe Pro Met Ala Ser
1 5



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2940 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 47..1663

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
CTAACTTCAG CGGTGGCACC GGGATCGGTT GCCTTGAGCC TGAAAT ATG ACC ACC 



1 



CC. 0C» <=»0 «.C .TX C.C CT. ^ .CT T.C ^ TCT CT= 

pro Gly Lys Glu Asn Phe Arg Leu Lys Ser lyr 



S 10 



- 5- -I S o?S tT. - S= ill 

20 

^. i% iT. ir. Ill iT^ s ^ 1% if. -I fi 

S IT. - tT. ^ ™^ l-T. if, fJ 'It s2 1% fi III 
103



151 



199 



247 
«° ne s sf„ s« =c;s s;? IT.

70 '^^ 
TCT =»C .TC «TT 0»a TTT .CC ... j=C CC. ..O C.C 

Thr Ser Asp Met lie Giu neu x 

8b 

T^nr- rTr, CTT TCA AAA GAA CCT AAC 
Zl l?r Ma fh^ G?n L^s Phe T.g Lys Z^u Leu Ser .ys CIu Pro ^.n 
100 

CCT CCT ATT O.T GAA OTT ATC AOC ACA CCA OCA CTA OTO OCC AGO TTT 
Pro Pro lie Asp Glu Val He ser Thr Pro Giy 
120 

0.0 rrc CTc c<=. .A. a»= ..T t=t tc. cto c.o ttx tc. 

Val Glu Phe Leu Lys Arg Lys Glu Asn cys ^ 
135 

ACA AAT ATT GCT TCA GGA AAT TCT CTT CAG ACC CGA 
^la Trp i:;u Thr A.n He Ala Ser Gly Acn Ser Leu ..n . 

150 ^-^^ 

51? ?f. ^fn Jf. - ^^^o - i 

^1 S r.^ 5| S f.5 s g 

Z OCT a=. G.T .OT .CC .TO TGC .0= 3.C T.T =TC TT. JAC T=C J.T 

lie Ala Gly Asp Ser Thr Met Cys Arg Asp ly ^iq 
200 ^"-'^ 
.TT CCC CCT CTT TTG CAG TTA TTT TCA AAG CAA AAC CGC CTG ACC 
ill Leu pro pro Leu Leu Gin Leu Pne ^er .y. ^^^^ A-^- -^--9 ^ 

r S S fa^ S TeS sfr ^^n fr^ S 

Met Thr Arg Asn Ala vai irp 

230 ^-^^ 

TVTsr^ rTT TPT CCA TGT CTG AAT GTG CTT 
ill ^T. S ?al ?S eye L,„ Val .,u 

.CC Z TT= CTC TTT =TC .OT O.C .CT CAT CT. CT3 =CT =.T CCC TCC 

ser Trp Leu Leu Phe Val Ser Asp mr asp ^75 
260 265 

TGG GCC CTC TCA TAT CTA TCA GAT GGA CCC AAT GAT AAA ATT CAA GCG
Trp Ala Leu Ser Tyr Leu Ser Asp Gly Pro Asn Asp Lys Ile Gln Ala

280 285
GTC ATC GAT GCG GGA GTA TGT AGG AGA CTT GTG GAA CTG CTG ATG CAT
Val Ile Asp Ala Gly Val Cys Arg Arg Leu Val Glu Leu Leu Met His
295

-1 1% ^ i I- 1- s f.i i 

Z. ^11 ill S?, ?S ill "I ^f„ Sf. 

vax i^iij- "-r- jjuj 



295 



343 



391 



439 



4S7 



535 



583 



631 



679 



727 



775 



823 



871 



967 



1015 



1063 
s sf„ ™ I- 1 S - - i

S s SI s ?s -r. ^ ?s s s S - 

.^r. rp^r^ rrA GCC CTC ATT AGT 

Sf„ ?| 'I^ Is r.; ^i; ... ... se. 

1 OCT ... TTT C3C =f OCT TC= =CC 

lie Leu Gin Thr Ala Glu Pne arg ^ 

„c .c. I cc. .cx CO. .0. =c. - - 

lie Thr Asn Aia Thr Ser Gly Gly Ser Ai 

... ... ..r, r.TC TGT GAT CTC CTC ACG GTC ATG 

GTA GAA CTG GGT Tur ... ^^^^ Leu Leu Thr v.. ..^^ 

V^^il Glu Leu GiV G y AriQ 

420 



435 



1111 



1159 



1207 



1255 



1 303 



1351 



-I I- S ill s s ^ I- 

440 

- - s s?; - - - ii ™' "° 

s :: 0. 

470 

...A CAG GAG ATC TAC CAA AAG GCC TTT GAT CTT 

TTA uAtj rt-^j- — - - , Pi Tig Tvr Gin uya "-^^ ---t 

Leu Gin ser His Glu Asn Gin Glu lie y 

s 5 sf. S5 s s S f/. s FJ 

500 

sf„ 5n =.f. |s„ -I 1 - - - - S 

ec, ... C«V Z CO CXT TO. .OC..T.CTC TOCTTTC.CO T.CCTOTOCT 

Pro Met Glu Gly Phe Gin Leu 

O.O.OC.OOC xloLoTCO .OTOCXCXTO XOO.OCOC.C .O.CCTC.TO O.OCT..C.. 
CTC....OT. .TCC.T..T. CTOTTTOCOC TC....OC.X OCC.TOCOC. OC.OC.OTC. 
..C.C.C.XC TOO....CCT CCOOC.C.CT OTOO.OOO.. .CCC^.. T..«OOOT. 
.CC.O..C00 CCC.C.CXC. .T..OOO... ...COCT.OO CX™o.O.. CCOC.C.X.C 

....O.0TX. .000..T... C.C...TT.. .OTOOC.CCC .T.TTCTTOX OO000..T.. 

..0.0O.0.C CTC0TC...C CCTTT..C.X OOOOO...^ .CXO.C.T,. ...O.TO.O. 

C....XCTXT ..CTTO..TT TT.C.O..C. .CTT.OO.C. .OOO.O.TOT XT.O.CCTOT 

TOOT.T.CTT C.O.OX.CTT XTC.TO.OTT CT.CC.C.OT 0..CC0TTOO .TT.CC.OO, 



1399 

1447 

1495 

1543 

1591 

1639 

1693 

1753 

1813 

1873 

1933 

1993 

2053 

2113 

217 3 
GGCTTTTTCT
TTGGCGCCAT 
TGAATGTGCT 
ATACTGGTGG 
GGCACATTTA 
AAACCATATC 
CAAATGCAAT 
TGTGGTCCTG 
GTGACTGGCA 
GCTCGATTCT 
AAAGAAATAA 
AACCTCATTG 
CTCCCCCCAC 



AGCCAGATTG 
TCTTCAGATA 
TTCTAGTTGT 
GTAAGAGCAG 
ATTTGTTCGC 
CTGTAATTTA 
CAGTGTAACT 
ATCAAGGTCC 
GATAACACAT 
AATCTTTTCA 
CCATCTTCTC 
TTTCCTAATC 
ACAAAATAAA 



CATTAATCCT 
TTAAAGTTAA 
CAGGAATGCT 
GGCACATTTA 
TTTTGCTTCT 
ATAAAAAAAA 
AGGGGCTGTG 
TCATTAAAAA 
ACTTTTGAAA 
TGCTGCACAC 
A.ACATGACCT 
TATTTTATTT 
AACAGTATCT 



TACTGAGATT 
ACCATCCACT 
GAAGAATTAA 
ATTTGTTCGC 



CTAAGGACGA 
TTTCTGCATT 
ATTGGAGTTC 
GTAACTTTGG 
GATTCCTTTA 
GCTTAACCCA 
CATCTCCTGC 
CGCTTCTGGC 



GGATGGTTTT 
CCCTCACCTT 
CACTTTGACT 
TTTTGCTTCT 
TTCGAATACT 
AAAAACCCCT 
AAAATAAATG 
ACCCTAGGCT 
GATTTTTTTT 
ATCGATAGCA 
AATAAGAACA 
TAGTACTGTG 
TCATTTT 



CTTTCCTCTA 
CAGCCTTCAG 
CCTAA.ATGTG 
CTTTGGTCTG 
TAGTAATCGA 
CCAATTTTCC 
TTTCAGGCTT 
TTTCCCCTCT 
CTTAGGTGCA 
TCCTTATCTG 
GTGATCTTAT 
CCGCTTCCCC 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 539 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Thr Thr Pro Gly Lys Glu Asn Phe Arg Leu Lys 
1 5 10 

Lys Ser Leu Asn Pro Asp Glu Met Arg Arg Arg Arg 
20 

Leu Gin Leu Arg Lys Gin Lys Arg Glu Glu Gin Leu 
35 

Asn Val Ala Thr Ala Glu Glu Glu Thr Glu Glu Glu 
50 

Gly Gly Phe His Glu Ala Gin He Ser Asn Met Glu 

6 5 

Gly val He Thr Ser Asp Met He Glu Met He Phe 

85 

r,iu cm Gin Leu Ser Ala Thr Gin Lys Phe Arg Lys 
100 -^^^ 

Glu Pro Asn Pro Pro He Asp Glu Val He Ser Thr 
115 120 

Ala Arg Phe Val Glu Phe Leu Lys Arg Lys Glu Asn 
130 ^-^^ 

. ^r-r. v^l Leu Thr Asn He Ala Ser 

Fne LjIU i-ixu. ' — ice; 

145 



ser Tyr Lys Asn 
15 

Glu Glu Glu Gly 
30 

Phe Lys Arg Arg 
45 

Val Met Ser Asp 



Met Ala Pro Gly 

80 

Ser Lys Ser Pro 
95 

Leu Leu Ser Lys 
110 

Pro Gly Val Val 
125 

Cys Ser Leu Gin 



Gly Asn Ser Leu 



2:: 3 3 

::l^9 3 
^53 
:!4i3 

2473 

2533 

2593 

2653 

2713 

2773 

2833 

2893 

2940 
Gln Thr Arg Ile Val Ile Gln Ala Arg Ala Val Pro Ile Phe Ile Glu

165 

^.e. l,c. Ser Ser Glu Phe Glu Asp Val Gin Glu Gin Ala Val Trp Ala 

180 

... Tie Ala Gly Asp Ser Thr Met Cys Arg Asp Tyr Val Leu 

T^o T^r Pro pro Leu Leu Gin Leu Phe Ser Lys Gin Asn 
Asp Cys Asn lie Leu Pro Pro i^eu 

210 

7M w^i Tr-T^ Ala Leu Ser Asn Leu Cys 
Arg Leu Thr Met Thr Arg Asn Ala Val Trp Ala i.eu 

225 230 

Phe Ala Lvs Val Ser Pro Cys Leu 
Arg Gly Lys ser Pro Pr^ P^- - - 255 

245 -^^^ 

^sn val Leu Ser Trp Leu Leu Phe Val Ser Asp Thr Asp Val Leu Ala 

260 

Ala cys Trp Ala Leu Ser Tyr Leu Ser Asp Giy Pro A.n Asp Lys 

275 



lie Gin Ala Val He Asp Ala Gly Val Cys Arg Arg Leu Val Glu Leu 

290 

TT^T qpr Pro Ala Leu Arg Ala Val 

Leu Met His Asn Asp Tyr Lys Val Val Ser Pro a y 

305 310 

Z Asn lie val Thr Gly Asp Asp He Gin Thr Gin Val He Leu Asn 

325 ^-^^ 
cys ser Ala Leu Gin Ser Leu Leu His Leu Leu Ser Ser Pro Lys Glu 

340 -^^^ 
... ... T.y. LVS Glu Ala cys Trp Thr He Ser Asn He Thr Ala Gly 

355 

.sn Arg Ala Gin He Gin Thr Val He Asp Ala Asn He Phe Pro Ala 

370 

Leu He ser He Leu Gin Thr Ala Glu Phe Arg Thr Arg Lys Glu Ala 
385 390 395 

Ma Trp Ala He Thr Asn Ala Thr Ser Gly Gly Ser Ala Glu Gin He 



405 



Lys Tyr Leu Val Glu Leu Gly Cys He Lys Pro Leu Cys Asp Leu Leu 

^ 420 426 

Thr val Met Asp Ser Lys He Val Gin Val Ala Leu Asn Gly Leu Glu 

435 

r..,, rip G-i- Ala Lvs Ara Asn Gly Thr Gly 
Asn He Leu Arg Leu Gxy Gxu Gxn C A.a i.y 

450 

Xle Asn Pro Tyr Cys Ala Leu He Glu Glu Ala Tyr Gly Leu A.p Lys 

465 "^"^^ 

lie Glu Phe Leu Gin Ser His Glu Asn Gin Glu He Tyr Gin Lys Ala 

485 

rin Hi=^ Tvr Phe Gly Thr Glu Asp Glu Asp Ser Ser 
Phe Asp Leu He Gxu H^s x^r t-^.e ^xy 

500 

Ile Ala Pro Gln Val Asp Leu Asn Gln Gln Gln Tyr Ile Phe Gln Gln
515

Cys Glu Ala Pro Met Glu Gly Phe Gln Leu
530 535



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 542 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

. ... .H. ... ser Ser Thr Ser Lys Phe Val Pro Glu Tyr 

Met ASP vj^y ---ir 

.r. Th. Phe L.s .sn Lys Cly .r, Phe Ser Ma .ep Olu Leu 

A., =ln Cl„ Val Glu Leu L,. .r, 

..p ol„ 1 Leu M. Lys «g »» "<> "° ^""^ 

50 

^sp Se. .sp Olu Olu ^.p Olu Ser Ser Val Ser .la .ep Oln OXn 

65 '^^ 

-n^j- rnu^ r;T n Gin Leu 
_. n^r. T.^u Gin Gin Glu Leu Pro uin Met xi.. 

se. ..p «P Met cm .lu 01. Leu Ser T.r Val Lys P.e 

100 

Oln lie Leu Ser .r, =lu Hi. Ar, Pro He ».p Val Val He Gl„ 

115 -^^^ 

fT.i Phe Met Arg Glu Asn Gin Pro 
Ala Gly val Val Pro Arg Leu Val Glu Phe Met ^^g 

130 

Olu Met Leu Gin Leu Glu Ala Ala Trp Ala Leu Thr Asn He Ala Ser 

145 

, ^, ^ -a^ ^^aT Val ASD Ala Asp Ala Val Pro 

Gly Thr ser Ala Gin Thr i^y^ - 175 

165 

Leu Phe lie ain Leu Leu Tyr Tl,r Oly ser Val Olu Val Lys Olu Oln 

180 

.1 v^l Ala Glv Asp Ser Thr Asp Tyr Arg 

Ala lie Trp Ala Leu Gly Asn Val Ala Gly P 

19~5 

Tyr val Leu Oln Cys »s„ »la Met Olu Pro lie Leu Oly Leu Phe 

210 

Asn ser Asn Lys Pro Ser Leu He Arg Xhr Ala Thr Trp Thr Leu Ser 

225 '^^^ 

.... ... r,lv Lvs Lvs Pro Gin Pro Asp Trp Ser Val Val Ser 

rtt5ii x.^^ -j.^ . - 250 "-^-^ 
. T -^/c. Leu Tyr Ser Met Asp Thr Glu

Gin Ala Leu Pro Thr Leu Ala ..ys Leu .^e y 

2 60 ^^-^ 
... ..u Val ASP Ala Cys Trp Ala He Ser Tyr Leu Ser Asp Gly Pro 

275 

Cln Olu Ala He Gin Ala Val He Asp Val Arg lie Pro Lys Arg Leu 

290 

val Clu Leu Leu Ser Hie Glu Ser Thr Leu Val Gin Thr Pro Ala Leu 

305 -^^^ 

.rg Ala Val Gly Asn He Val Thr Gly Asn Asp Leu Gin Thr Gin Val 

32 5 

val He Asn Ala Gly Val Leu Pro Ala Leu Arg Leu Leu Leu Ser Ser 

340 

P.o Lys Olu Asn He Lys Lys OIu Ala Cys Trp Thr He Ser Asn He 

355 -^^^ 

rri' . J^P nlT^ Ala VaJ- XJ-t; rtoj^ r.^-^ 

Thr Ala Gly Asn Tnr Giu G^n He Gi.. ax 

370 -"'^ 
lie pro pro Leu Val Lys Leu Leu Glu Val Ala Glu Tyr Lys Thr Lys 

385 

.ys Glu Ala cys Trp Ala He Ser Asn Ala Ser Ser Gly Gly Leu Gin 

^ 405 
.rg pro Asp He He Arg Tyr Leu Val Ser Gin Gly Cys He Lys Pro 
420 '^^^ 
.vs .,P I..U I.=u Olu Ue ..P »n He Ilj Clu Val T., 

435 

..u ASP Ala Leu Glu Asn He Leu Lys Me. Gly Glu Ala Asp Lys Glu 
" 450 4bb 

.la Arg Gly Leu Asn He Asn Olu Asn Ala Asp Phe He Glu Lys Ala 

liy Gly Met Glu Lys He Phe Asn Cys Gin Gin Asn Glu Asn Asp Lys 

^ 485 

Ti<= r^u Thr Tyr Phe Gly Glu Glu 
lie Tyr Glu Lys Ala Tyr Lys xle lie G.a Thr ly 

500 

Olu ASP Ala Val Asp Glu Thr Met Ala Pro Cln Asn Ala Gly Asn Thr 

515 

Phe Gly Phe Gly Ser Asn Val Asn Gin Gin Phe Asn Phe Asn 

530 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
OGAGGCACCG AAGGGC.GCG CCGAGTCGGA GGGGGCGAAG ATTGACGCCA GTAAGAACGA 
GGAGGATGAA GGCCATTCAA ACTCCTCCCC ACGACACTCT GAAGCAGCCA CGGCACAGCG 
GGAAGAATGG AAAATGTTTA TAGGAGGCCT TAGCTGGGAC ACTACAAAGA 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1827 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1352

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

f.'i i^i ^- »^ - - - "1 



10 

1% sf. S III ?fo 1% -I I'l - If. 
s?, ^l s ^i ^i - =1 - - - ^ - 

i S »n f,^ If. ^» =tS S 

^ =.f. ^fa S 5 S O^S Sf„ ?i 

i ^i iii s i s s= ?i ^ ?s 
=.15 s s r.i III IT. Ill '^i ^ 



100 



^11 III ?n =»f, fe^. -V si ?S5 - 
s s jr, ?s sf. Ill III ^ III 

130 

CAC ATC AGT GAA CAA GCT GTC TGG GCT CTA GGA AAC ATT GCA GGT GAT
His Ile Ser Glu Gln Ala Val Trp Ala Leu Gly Asn Ile Ala Gly Asp



145 



150 155 



60 
120 
170 



48 



96 



144 



192 



240 



288 



336 



384 



432 



480 



GGC TCA GTG TTC CGA GAC TTG GTT ATT AAG TAC GGT GCA GTx GAC CCA 
Gly Ser Val Phe Arg Asp Leu Val Ile Lys Tyr Gly Ala Val Asp Pro

^ ---- 170 X I ^ 



165 



CTG TTG GOT CTC CTT GCA GTT GOT GAT ATG TCA TCT TTA GGA TGT GGC 

Leu Leu Ala Leu Leu Ala Vai Pro Asp Met S.r ..er .e- A.a y- . 
180 

, _,_ „ ~^„, »,»m /-iT^rp rnrLn CQC AAG AAG 

TAP TTA CGT AAT CTT ACC TGG ACA Ci i iCi x 1.:-^ -'-^ -- - 

TAG TTA GGi ^^^^ ^ ^^^^ 

ryr ^eu n.y -^^ 205 

- P^o ^fo 5al - - i - - 

210 215 220 

CAT CAT GAT GAT CCA GAA GTG TTA GCA GAT ACC TGC 
val Arg ifu l.u His His Asp Asp Pro Glu Val Leu Ala Asp Tnr cys 

TGG GCT ATT TGC TAG CTT ACT GAT GGT CCA AAT GAA CGA ATT GGC ATG 
Vtl iTe ser Tvr Leu Thr Asp Gly Pro Asn Glu Arg He Gly Met 

- - t „ , ,- V R ri ^: . j J 



245 



GTG GTG AAA ACA GGA GTT GTG CCC CAA CTT GTG AAG CTT CTA GGA GCT
Val Val Lys Thr Gly Val Val Pro Gln Leu Val Lys Leu Leu Gly Ala

260 265

TCT GAA TTG CCA ATT GTG ACT CCT GCC CTA AGA GCC ATA GGG AAT ATT
Ser Glu Leu Pro Ile Val Thr Pro Ala Leu Arg Ala Ile Gly Asn Ile



275 



5S ^f. S III - ?S r„ III ^11 1% IT. IT. 

290 295 300 

CTC GCC GTG TTT CCC AGC CTG CTC ACC AAC CCC AAA ACT AAC ATT CAG
Leu Ala Val Phe Pro Ser Leu Leu Thr Asn Pro Lys Thr Asn Ile Gln
305 310

AAP GAA GCT ACG TGG ACA ATG TCA AAC ATC ACA GCC GGC GGC CAG GAC 
Lys Glu Ala Thr Trp Thr Met Ser Asn Ile Thr Ala Gly Arg Gln Asp
325 330 335

CAG ATA CAG CAA GTT GTG AAT CAT GGA TTA GTC CCA TTC CTT GTC AGT
Gln Ile Gln Gln Val Val Asn His Gly Leu Val Pro Phe Leu Val Ser
340 345

GTT CTC TCT AAG GCA GAT TTT AAG ACA CAA AAG GAA GCT GTG TGG GCC
Val Leu Ser Lys Ala Asp Phe Lys Thr Gln Lys Glu Ala Val Trp Ala

355 360 365

GTG ACC AAC TAT ACC AGT GGT GGA ACA GTT GAA CAG ATT GTG TAC CTT
Val Thr Asn Tyr Thr Ser Gly Gly Thr Val Glu Gln Ile Val Tyr Leu
370 375 380

GTT CAC TGT GGC ATA ATA GAA CCG TTG ATG AAC CTC TTA ACT GCA AAA
Val His Cys Gly Ile Ile Glu Pro Leu Met Asn Leu Leu Thr Ala Lys
385 390

GAT ACC AAG ATT ATT CTG GTT ATC CTG GAT GCC ATT TCA AAT ATC TTT
Asp Thr Lys Ile Ile Leu Val Ile Leu Asp Ala Ile Ser Asn Ile Phe

405 410

CAG GCT GCT GAG AAA CTA GGT GAA ACT AGC TGC CCG TCT TCA CAG ATT
Gln Ala Ala Glu Lys Leu Gly Glu Thr Ser Cys Pro Ser Ser Gln Ile
420 425 430



575 



624 



672 



72 0 



768 



alb 



864 



912 



960 



1008 



1056 



1104 



1152 



12 00 



1248 



1296 
C.^ OOC ... .0. CAC T.C .0. .AT OAO OCO TCC OAC CCG TCO



Gln Glu Gln Gly Lys Arg Gln Tyr Arg Asn Glu



435 



CAG AAT AGA GAA ACT TAG TATAATGATT GAAGAATGTG GAGGCTTAGA 
Gln Asn Arg Glu Thr *
450 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 454 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
Glu Val Asn Val Glu Leu Arg Lys Ala Lys Lys Asp Asp Gln Met Leu

1 5
Lys Arg Arg Asn Val Ser Ser Phe Pro Asp Asp Ala Thr Ser Pro Leu

20 25
Gln Glu Asn Arg Asn Asn Gln Gly Thr Val Asn Trp Ser Val Asp Asp

35 40
Ile Val Lys Gly Ile Asn Ser Ser Asn Val Glu Asn Gln Leu Gln Ala

50 55
Thr Gln Ala Ala Arg Lys Leu Leu Ser Arg Glu Lys Gln Pro Pro Ile

65 70 75 

ps^. ... Tie He Arg Ala Gly Leu He Pro Lys Phe Val Ser Phe Leu 

Gly Arg Thr Asp Cys Ser Pro Ile Gln Phe Glu Ser Ala Trp Ala Leu

100

Thr Asn Ile Ala Ser Gly Thr Ser Glu Gln Thr Lys Ala Val Val Asp
115 120 125

Gly Gly Ala Ile Pro Ala Phe Ile Ser Leu Leu Ala Ser Pro His Ala

130

His Ile Ser Glu Gln Ala Val Trp Ala Leu Gly Asn Ile Ala Gly Asp
145 150 155

Gly Ser Val Phe Arg Asp Leu Val Ile Lys Tyr Gly Ala Val Asp Pro



1344 



1392 



CAAAATTGAA 


GCiCT AC AAA 


A C C A T G A A^ A- A. 


TGAGTCTGTG 


TATAAGGCTT 




1452 


AATTGAGAAG TATTTCTCTG TAGAGGAAGA GGAAGATCAA AACGTTGTAC CAGAAACTAC 1512
CTCTGAAGGC TACACTTTCC AAGTTCAGGA TGGGGCTCCT GGGACCTTTA ACTTTTAGAT 1572
CATGTAGCTG AGACATAAAT TTGTTGTGTA CTACGTTTGG TATTTTGTCT TATTGTTTCT 1632
CTACTAAGAA CTCTTTCTTA AATGTGGTTT GTTACTGTAG CACTTTTTAC ACTGAAACTA 1692
TACTTGAACA GTTCCAACTG TACATACATA CTGTATGAAG CTTGTCCTCT GACTAGGTTT 1752
CTAATTTCTA TGTGGAATTT CCTATCTTGC AGCATCCTGT AAATAAACAT TCAAGTCCAC 1812
1827 


CCTTTTCTTG 


ACT TO 
165 170 175

Leu Leu Ala Leu Leu Ala Val Pro Asp Met Ser Ser Leu Ala Cys Gly
180 185

Tyr Leu Arg Asn Leu Thr Trp Thr Leu Ser Asn Leu Cys Arg Asn Lys
195 200 205

r.^^ m= PKo Pr-n Tie ASD Ala val Glu Gin He Leu Pro Thr Leu 
210 215 220 

Val Arg Leu Leu His His Asp Asp Pro Glu Val Leu Ala Asp Thr Cys
225 230 235

Trp Ala Ile Ser Tyr Leu Thr Asp Gly Pro Asn Glu Arg Ile Gly Met
245 250

Val Val Lys Thr Gly Val Val Pro Gln Leu Val Lys Leu Leu Gly Ala
260 265 270

Ser Glu Leu Pro Ile Val Thr Pro Ala Leu Arg Ala Ile Gly Asn Ile
275 280

Val Thr Gly Thr Asp Glu Gln Thr Gln Val Val Ile Asp Ala Gly Ala
290 295

Leu Ala Val Phe Pro Ser Leu Leu Thr Asn Pro Lys Thr Asn Ile Gln
305 310 315

Lys Glu Ala Thr Trp Thr Met Ser Asn Ile Thr Ala Gly Arg Gln Asp
325 330

Gln Ile Gln Gln Val Val Asn His Gly Leu Val Pro Phe Leu Val Ser
340 345 350

Val Leu Ser Lys Ala Asp Phe Lys Thr Gln Lys Glu Ala Val Trp Ala
355 360 365

Val Thr Asn Tyr Thr Ser Gly Gly Thr Val Glu Gln Ile Val Tyr Leu
370 375 380

Val His Cys Gly Ile Ile Glu Pro Leu Met Asn Leu Leu Thr Ala Lys
385 390 395

Asp Thr Lys Ile Ile Leu Val Ile Leu Asp Ala Ile Ser Asn Ile Phe
405 410

Gln Ala Ala Glu Lys Leu Gly Glu Thr Ser Cys Pro Ser Ser Gln Ile
420 425 430

Gln Glu Gln Gly Lys Arg Gln Tyr Arg Asn Glu Ala Ser Glu Ala Ser
435 440 445

Gln Asn Arg Glu Thr *
450 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 259 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GAACGACCAA GAGGGTGTTC GACTGCTAGA GCCGAGCAGA AGCGTGCCTA AATCAAAGGA 

ACTTGTTTCT TCAAGCTCTT CTGGCAGTGA TTCTGACAGT GAGGTTGACA AAAAGTTAAG 

-^rr^A^arAAA CA'!^'^'^GI^<"^0 GTGAGACTTC 

CAGGAAAAAG CAAGTTGCTC CAGAaAAaC^ iv^iA^AGAAA GA 

GAGAGCCCTG TCATCTTCTA AACAGAGCAG CAGGAGCAGA GATGATAACA TGTTTCAGAT 
TGGGAAAATG AGGTCAGTT 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 221 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TGTCGACTGT GGCTTTGAGC ATCCGTCAGA AGTCCAGCAT GAGTGCATCC CTCAGGCCAT 
TCTGGGAATG GATGTCCTGT GCCAGGCCAA GTCGGGCATG GGAAAGACAG CAGTGTTTGT 
CTTGGCCACA CTGCAACAGC TGGAGCCAGT TACTGGGCAG GTGTCTGTAC TGGTGATGTG 
TCACACTCGG GAGTTGGCTT TTCAGATCAG CAAGGAATAT G 
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 372 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
ATTTGTAAAC CCCGGAGCGA GGTTCTGCTT ACCCGAGGCC GCTGCTGTGC GGAGACCCCC 
GGGTGAAGCC ACCGTCATCA TGTCTGACCA GGAGGCAAAA CCTTCAACTG AGGACTTGGG 
GGATAAGAAG GAAGGTGAAT ATATTAAACT CAAAGTCATT GGACAGGATA GCAGTGAGAT 
TCACTTCAAA GTGAAAATGA CAACACATCT CAAGAAACTC AAAGAATCAT ACTGTCAAAG 
ACAGGGTGTT CCAATGAATT CACTCAGGTT TCTCTTTGAG GGTCAGAGAA TTGCTGATAA 
TCATACTCCA AAAGAACTGG GAATGGAGGA AGAAGTTGTG ATTGAAGTTT ATCAGGAACA 
AACGGGGGGT CA 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2675 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 104..2311



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TCTGACCCTC GTCCCGCCCC CGCCATTCGC CGCCTCCTCC TGTCCCGCAG TCGGCGTCCA 60
GCGGCTCTGC TTGTTCGTGT GTGTGTCGTT GCAGGCCTTA TTC ATG GGC TCA CCG 115
Met Gly Ser Pro
1

[six codons illegible] GTG GTA CTG GTC ACC GGC GCG GGG GCA GGA 163
Leu Arg Phe Asp Gly Arg Val Val Leu Val Thr Gly Ala Gly Ala Gly
5 10 15

TTG GGC CGA GCC TAT GCC CTG GCT TTT GCA GAA AGA GGA GCG TTA GTT 211
Leu Gly Arg Ala Tyr Ala Leu Ala Phe Ala Glu Arg Gly Ala Leu Val
25 30

GTT GTG AAT GAT TTG GGA GGG GAC TTC AAA GGA GTT GGT AAA GGC TCC 259
Val Val Asn Asp Leu Gly Gly Asp Phe Lys Gly Val Gly Lys Gly Ser
40 45

TTA GCT GAT AAG GTT GTT GAA GAA ATA AGA AGG AGA GGT GGA AAA GCA 307
Leu Ala Asp Lys Val Val Glu Glu Ile Arg Arg Arg Gly Gly Lys Ala
55 60

GTG GCC AAC TAT GAT TCA GTG GAA GAA GGA GAG AAG GTT GTG AAG ACA 355
Val Ala Asn Tyr Asp Ser Val Glu Glu Gly Glu Lys Val Val Lys Thr
70 75 80

GCC CTG GAT GCT TTT GGA AGA ATA GAT GTT GTG GTC AAC AAT GCT GGA 403
Ala Leu Asp Ala Phe Gly Arg Ile Asp Val Val Val Asn Asn Ala Gly
85 90 95

ATT CTG AGG GAT CAT TCC TTT GCT AGG ATA AGT GAT GAA GAC TGG GAT 451
Ile Leu Arg Asp His Ser Phe Ala Arg Ile Ser Asp Glu Asp Trp Asp
105 110 115

ATA ATC CAC AGA GTT CAT TTG CGG GGT TCA TTC CAA GTG ACA CGG GCA 499
Ile Ile His Arg Val His Leu Arg Gly Ser Phe Gln Val Thr Arg Ala
120 125 130

GCA TGG GAA CAC ATG AAG AAA CAG AAG TAT GGA AGG ATT ATT ATG ACT 547
Ala Trp Glu His Met Lys Lys Gln Lys Tyr Gly Arg Ile Ile Met Thr
135 140 145

TCA TCA GCT TCA GGA ATA TAT GGC AAC TTT GGC CAG GCC AAT TAT AGT 595
Ser Ser Ala Ser Gly Ile Tyr Gly Asn Phe Gly Gln Ala Asn Tyr Ser
150 155 160

GCT GCA AAG TTG GGT CTT CTG GGC CTT GCA AAT TCT CTT GCA ATT GAA 643
Ala Ala Lys Leu Gly Leu Leu Gly Leu Ala Asn Ser Leu Ala Ile Glu
165 170 175

GGC AGG AAA AGC AAC ATT CAT TGT AAC ACC ATT GCT CCT AAT GCG GGA 691
Gly Arg Lys Ser Asn Ile His Cys Asn Thr Ile Ala Pro Asn Ala Gly
185 190 195

TCA CGG ATG ACT CAG ACA GTT ATG CCT GAA GAT CTT GTG GAA GCC TTG 739
Ser Arg Met Thr Gln Thr Val Met Pro Glu Asp Leu Val Glu Ala Leu
200 205 210

AAG CCA GAG TAT GTG GCA CCT CTT GTC CTT TGG CTT TGT CAC GAG AGT 787
Lys Pro Glu Tyr Val Ala Pro Leu Val Leu Trp Leu Cys His Glu Ser
215 220 225

TGT GAG GAG AAT GGT GGC TTG TTT GAG GTT GGT GCA GGA TGG ATT GGA 835
Cys Glu Glu Asn Gly Gly Leu Phe Glu Val Gly Ala Gly Trp Ile Gly
230 235 240

AAA TTA CGC TGG GAG CGG ACT CTT GGA GCT ATT GTA AGA CAA AAG AAT 883
Lys Leu Arg Trp Glu Arg Thr Leu Gly Ala Ile Val Arg Gln Lys Asn
245 250 255 260

CAC CCA ATG ACT CCT GAG GCA GTC AAG GCT AAC TGG AAG AAG ATC TGT 931
His Pro Met Thr Pro Glu Ala Val Lys Ala Asn Trp Lys Lys Ile Cys

^, ^ ^ -7 n 9 7 R 



r^T\r^ rprprp TV 
\jC\'^ J. X -L v?^- ' 



^;^G AAT GCC AGC AAG CCT CAG AGT ATC CAA GAA TCA ACT GGC ^/^ 
Asp Phe Glu Asn Ala Ser Lys Pro Gln Ser Ile Gln Glu Ser Thr Gly
280 285 290 

AGT ATA ATT GAA GTT CTG AGT AAA ATA GAT TCA GAA GGA GGA GTT TCA 1027
Ser Ile Ile Glu Val Leu Ser Lys Ile Asp Ser Glu Gly Gly Val Ser
295 300 305

GCA AAT CAT ACT AGT CGT GCA ACG TCT ACA GCA ACA TCA GGA TTT GCT 1075
Ala Asn His Thr Ser Arg Ala Thr Ser Thr Ala Thr Ser Gly Phe Ala
310 315 320

GGA GCT ATT GGC CAG AAA CTC CCT CCA TTT TCT TAT GCT TAT ACG GAA 1123
Gly Ala Ile Gly Gln Lys Leu Pro Pro Phe Ser Tyr Ala Tyr Thr Glu
325 330 335 340

CTG GAA GCT ATT ATG TAT GCC CTT GGA GTG GGA GCG TCA ATC AAG GAT 1171
Leu Glu Ala Ile Met Tyr Ala Leu Gly Val Gly Ala Ser Ile Lys Asp
345 350 355

CCA AAA GAT TTG AAA TTT ATT TAT GAA GGA AGT TCT GAT TTC TCC TGT 1219
Pro Lys Asp Leu Lys Phe Ile Tyr Glu Gly Ser Ser Asp Phe Ser Cys
360 365 370

TTG CCC ACC TTC GGA GTT ATC ATA GGT CAG AAA TCT ATG ATG GGT GGA 1267
Leu Pro Thr Phe Gly Val Ile Ile Gly Gln Lys Ser Met Met Gly Gly
375 380 385

GGA TTA GCA GAA ATT CCT GGA CTT TCA ATC AAC TTT GCA AAG GTT CTT 1315
Gly Leu Ala Glu Ile Pro Gly Leu Ser Ile Asn Phe Ala Lys Val Leu
390 395 400

CAT GGA GAG CAG TAC TTA GAG TTA TAT AAA CCA CTT CCC AGA GCA GGA 1363
His Gly Glu Gln Tyr Leu Glu Leu Tyr Lys Pro Leu Pro Arg Ala Gly
405 410 415 420

AAA TTA AAA TGT GAA GCA GTT GTT GCT GAT GTC CTA GAT AAA GGA TCC 1411
Lys Leu Lys Cys Glu Ala Val Val Ala Asp Val Leu Asp Lys Gly Ser
425 430 435

GGT GTA GTG ATT ATT ATG GAT GTC TAT TCT TAT TCT GAG AAG GAA CTT 1459
Gly Val Val Ile Ile Met Asp Val Tyr Ser Tyr Ser Glu Lys Glu Leu
440 445 450
ATA TCP CAC AAT CAG TTC TCT CTC TTT CTT GTT GGC TCT GGA GGC TTT
Ile Cys His Asn Gln Phe Ser Leu Phe Leu Val Gly Ser Gly Gly Phe
455 460 465

„ -.^ rr^r HAT '""t^A GC^ CTA GCC ATA COT 

GOT GGA AAA CGG ACA TCA GAG G.C AAG ^.A ^C- G « _ 

Gly Gly Lys Arg Thr Ser Asp Lys Val Lys Val Ala Val Ala He 

470 475 480 

r-r^-v G^T GPT GTA CTT ACA GAT ACC ACC TCT CTT AAT CAG 
Asn Arg Pro Pro Asp Ala Val Leu Thr Asp Thr Thr Ser Leu Asn Gln

485 490 495

GCT GCT TTG TAG CGC CTC AGT GGA GAC CGG AAT CCC TTA CAC ATT GAT
Ala Ala Leu Tyr Arg Leu Ser Gly Asp Arg Asn Pro Leu His Ile Asp
505 510 515

CCT AAC TTT GCT AGT CTA GCA GGT TTT GAC AAG CCC ATA TTA CAT GGA
Pro Asn Phe Ala Ser Leu Ala Gly Phe Asp Lys Pro Ile Leu His Gly
520 525

TTA TGT ACA TTT GGA TTT TCT GCC AGG CGT GTG TTA CAG CAG TTT GCA
Leu Cys Thr Phe Gly Phe Ser Ala Arg Arg Val Leu Gln Gln Phe Ala
535 540

GAT AAT GAT GTG TCA AGA TTC AAG GCA GTT AAG GCT CGT TTT GCA AAA
Asp Asn Asp Val Ser Arg Phe Lys Ala Val Lys Ala Arg Phe Ala Lys
550 555 560

CCA GTA TAT CCA GGA CAA ACT CTA CAA ACT GAG ATG TGG AAG GAA GGA
Pro Val Tyr Pro Gly Gln Thr Leu Gln Thr Glu Met Trp Lys Glu Gly
565 570 575

AAC AGA ATT CAT TTT CAA ACC AAG GTC CAA GAA ACT GGA GAC ATT GTC
Asn Arg Ile His Phe Gln Thr Lys Val Gln Glu Thr Gly Asp Ile Val
585 590

ATT TCA AAT GCA TAT GTG GAT CTT GCA CCA ACA TCT GGT ACT TCA GCT
Ile Ser Asn Ala Tyr Val Asp Leu Ala Pro Thr Ser Gly Thr Ser Ala
600 605

AAG ACA CCC TCT GAG GGC GGG AAG CTT CAG AGT ACC TTT GTA TTT GAG
Lys Thr Pro Ser Glu Gly Gly Lys Leu Gln Ser Thr Phe Val Phe Glu
615 620 625

GAA ATA GGA CGC CGC CTA AAG GAT ATT GGG CCT GAG GTG GTG AAG AAA
Glu Ile Gly Arg Arg Leu Lys Asp Ile Gly Pro Glu Val Val Lys Lys
630 635 640

GTA AAT GCT GTA TTT GAG TGG CAT ATA ACC AAA GGC GGA AAT ATT GGG
Val Asn Ala Val Phe Glu Trp His Ile Thr Lys Gly Gly Asn Ile Gly
645 650 655

GCT AAG TGG ACT ATT GAC CTG AAA AGT GGT TCT GGA AAA GTG TAC CAA
Ala Lys Trp Thr Ile Asp Leu Lys Ser Gly Ser Gly Lys Val Tyr Gln
665 670 675

GGC CCT GCA AAA GGT GCT GCT GAT ACA ACA ATC ATA CTT TCA GAT GAA
Gly Pro Ala Lys Gly Ala Ala Asp Thr Thr Ile Ile Leu Ser Asp Glu
680 685

GAT TTC ATG GAG GTG GTC CTG GGC AAG CTT GAC CCT CAG AAG GCA TTC
Asp Phe Met Glu Val Val Leu Gly Lys Leu Asp Pro Gln Lys Ala Phe
695 700 705

— AGT GGC AGG CTG AAG GCC AGA GGG AAC ATC ATG CTG AGC CAG AAA 
Phe Ser Gly Arg Leu Lys Ala Arg Gly Asn Ile Met Leu Ser Gln Lys
710 715 720



1507 



1555 



1603 



1651 



1747 



1795 



1843 



1891 



1939 



1987 



2035 



2083 



!179 



2227 



2275 
CTT CAG ATG ATT CTT AAA GAC TAG GCC AAG CTC TGA AGGGCACACT
Leu Gln Met Ile Leu Lys Asp Tyr Ala Lys Leu *

725 730 735 

,^_,„„, ^ ^r^r^n^r'^'^CA C'^'^^A'^T^ATG CTTGATTATT 

ACACTATTAA TAAAAATGGA ATCAi irtHrti n^iCxC^^^A 

CTGCAAAAGT GATTAGAACT AAGATGCAGG GGAAATTGCT TAACATTTTC AGATATCAGA 

— - ^'v^-vni.^^'V'V PTAGTAATTT TTCATGTATC ATTATTTTTA CAAGGAACTA 

TATATAAGCT AGCACATAAT TATCCTTCTG TTCTTAGATC TGTATCTTCA TAATAAAAAA 

ATTTTGCCCA AGTCCTGTTT CCTTAGAATT TGTGATAGCA TTGATAAGTT GAAAGGAAAA 

TTAAATCAAT AAAGGCCTTT GATACCTTTA AAAAAAAAAA AAAAAAAAAA AAAA 
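
The CDS LOCATION of an entry fixes how many codons it contains, which gives a quick arithmetic check against the length of the corresponding protein entry. The sketch below applies it to the LOCATION 104..2311 declared for SEQ ID NO: 19; the function name `cds_codons` is hypothetical. The result, 736 codons, is consistent with the 736 positions of SEQ ID NO: 20 if the terminal `*` (stop) is counted as the last position.

```python
def cds_codons(start: int, end: int) -> int:
    """Number of codons in a CDS given 1-based, inclusive coordinates."""
    length = end - start + 1
    if length % 3:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3

codons = cds_codons(104, 2311)  # LOCATION: 104..2311 for SEQ ID NO: 19
print(codons)       # total codons, including a terminal stop codon if present
print(codons - 1)   # residues, assuming the final codon is a stop
```

A LOCATION whose length is not a multiple of three raises an error, which can help flag a mis-scanned coordinate.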

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 736 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Gly Ser Pro Leu Arg Phe Asp Gly Arg Val Val Leu Val Thr Gly
1 5 10 15

Ala Gly Ala Gly Leu Gly Arg Ala Tyr Ala Leu Ala Phe Ala Glu Arg
20 25 30

Gly Ala Leu Val Val Val Asn Asp Leu Gly Gly Asp Phe Lys Gly Val
35 40 45

Gly Lys Gly Ser Leu Ala Asp Lys Val Val Glu Glu Ile Arg Arg Arg
50 55 60

Gly Gly Lys Ala Val Ala Asn Tyr Asp Ser Val Glu Glu Gly Glu Lys
65 70 75 80

Val Val Lys Thr Ala Leu Asp Ala Phe Gly Arg Ile Asp Val Val Val
85 90 95

Asn Asn Ala Gly Ile Leu Arg Asp His Ser Phe Ala Arg Ile Ser Asp
100 105 110

Glu Asp Trp Asp Ile Ile His Arg Val His Leu Arg Gly Ser Phe Gln
115 120 125

Val Thr Arg Ala Ala Trp Glu His Met Lys Lys Gln Lys Tyr Gly Arg
130 135 140

Ile Ile Met Thr Ser Ser Ala Ser Gly Ile Tyr Gly Asn Phe Gly Gln
145 150 155 160

Ala Asn Tyr Ser Ala Ala Lys Leu Gly Leu Leu Gly Leu Ala Asn Ser
165 170 175

Leu Ala Ile Glu Gly Arg Lys Ser Asn Ile His Cys Asn Thr Ile Ala
180 185 190

Pro Asn Ala Gly Ser Arg Met Thr Gln Thr Val Met Pro Glu Asp Leu
195 200 205

Val Glu Ala Leu Lys Pro Glu Tyr Val Ala Pro Leu Val Leu Trp Leu
210 215 220 

Cys His Glu Ser Cys Glu Glu Asn Gly Gly Leu Phe Glu Val Gly Ala 
225 230 235 240 

Gly Trp Ile Gly Lys Leu Arg Trp Glu Arg Thr Leu Gly Ala Ile Val
245 250 255

Arg Gln Lys Asn His Pro Met Thr Pro Glu Ala Val Lys Ala Asn Trp
260 265 270

Lys Lys Ile Cys Asp Phe Glu Asn Ala Ser Lys Pro Gln Ser Ile Gln
275 280 285

Glu Ser Thr Gly Ser Ile Ile Glu Val Leu Ser Lys Ile Asp Ser Glu
290 295 300

Gly Gly Val Ser Ala Asn His Thr Ser Arg Ala Thr Ser Thr Ala Thr
305 310 315 320

Ser Gly Phe Ala Gly Ala Ile Gly Gln Lys Leu Pro Pro Phe Ser Tyr
325 330 335

Ala Tyr Thr Glu Leu Glu Ala Ile Met Tyr Ala Leu Gly Val Gly Ala
340 345 350

Ser Ile Lys Asp Pro Lys Asp Leu Lys Phe Ile Tyr Glu Gly Ser Ser
355 360 365

Asp Phe Ser Cys Leu Pro Thr Phe Gly Val Ile Ile Gly Gln Lys Ser
370 375 380

Met Met Gly Gly Gly Leu Ala Glu Ile Pro Gly Leu Ser Ile Asn Phe
385 390 395 400

Ala Lys Val Leu His Gly Glu Gln Tyr Leu Glu Leu Tyr Lys Pro Leu
405 410 415

Pro Arg Ala Gly Lys Leu Lys Cys Glu Ala Val Val Ala Asp Val Leu
420 425 430

Asp Lys Gly Ser Gly Val Val Ile Ile Met Asp Val Tyr Ser Tyr Ser
435 440 445

Glu Lys Glu Leu Ile Cys His Asn Gln Phe Ser Leu Phe Leu Val Gly
450 455 460

Ser Gly Gly Phe Gly Gly Lys Arg Thr Ser Asp Lys Val Lys Val Ala
465 470 475 480

Val Ala Ile Pro Asn Arg Pro Pro Asp Ala Val Leu Thr Asp Thr Thr
485 490 495

Ser Leu Asn Gln Ala Ala Leu Tyr Arg Leu Ser Gly Asp Arg Asn Pro
500 505 510

Leu His Ile Asp Pro Asn Phe Ala Ser Leu Ala Gly Phe Asp Lys Pro
515 520 525

Ile Leu His Gly Leu Cys Thr Phe Gly Phe Ser Ala Arg Arg Val Leu
530 535 540

Gln Gln Phe Ala Asp Asn Asp Val Ser Arg Phe Lys Ala Val Lys Ala
545 550 555 560

Arg Phe Ala Lys Pro Val Tyr Pro Gly Gln Thr Leu Gln Thr Glu Met
565 570 575

Trp Lys Glu Gly Asn Arg Ile His Phe Gln Thr Lys Val Gln Glu Thr
580 585 590

Gly Asp Ile Val Ile Ser Asn Ala Tyr Val Asp Leu Ala Pro Thr Ser
595 600 605

Gly Thr Ser Ala Lys Thr Pro Ser Glu Gly Gly Lys Leu Gln Ser Thr
610 615 620

Phe Val Phe Glu Glu Ile Gly Arg Arg Leu Lys Asp Ile Gly Pro Glu
625 630 635 640

Val Val Lys Lys Val Asn Ala Val Phe Glu Trp His Ile Thr Lys Gly
645 650 655

Gly Asn Ile Gly Ala Lys Trp Thr Ile Asp Leu Lys Ser Gly Ser Gly
660 665 670

Lys Val Tyr Gln Gly Pro Ala Lys Gly Ala Ala Asp Thr Thr Ile Ile
675 680 685

Leu Ser Asp Glu Asp Phe Met Glu Val Val Leu Gly Lys Leu Asp Pro
690 695 700

Gln Lys Ala Phe Phe Ser Gly Arg Leu Lys Ala Arg Gly Asn Ile Met
705 710 715 720

Leu Ser Gln Lys Leu Gln Met Ile Leu Lys Asp Tyr Ala Lys Leu *
725 730 735
725 730 735 